Fix excessive thread name lookups by tdryer · Pull Request #790 · benfred/py-spy

tdryer · 2025-07-29T18:23:37Z

py-spy uses a cache to minimize the overhead of looking up Python thread names during sampling. If looking up a thread name fails, the entire cache is rebuilt. However, if a failure is persistent, this causes the cache to be rebuilt during every sample (potentially more than once), creating excessive overhead.

Two cases when persistent thread name lookup failures can occur are:

When the threading module has not been imported
When a thread was created outside of the threading module, such as through _thread.start_new_thread

Fix this by modifying the cache to store a negative response (None) if the thread name is still not found after rebuilding the cache. To prevent negative responses from being lost when more than one thread name lookup fails, add a flag preventing the cache from being rebuilt more than once per sample.

One disadvantage of this approach is that in nonblocking mode, a spurious negative response can be cached for an extended duration. This happens if the thread is created during a sample and after the single allowed thread name lookup happens.

py-spy uses a cache to minimize the overhead of looking up Python thread names during sampling. If looking up a thread name fails, the entire cache is rebuilt. However, if a failure is persistent, this causes the cache to be rebuilt during every sample (potentially more than once), creating excessive overhead. Two cases when persistent thread name lookup failures can occur are: * When the `threading` module has not been imported * When a thread was created outside of the `threading` module, such as through `_thread.start_new_thread` Fix this by modifying the cache to store a negative response (`None`) if the thread name is still not found after rebuilding the cache. To prevent negative responses from being lost when more than one thread name lookup fails, add a flag preventing the cache from being rebuilt more than once per sample. One disadvantage of this approach is that in nonblocking mode, a spurious negative response can be cached for an extended duration. This happens if the thread is created during a sample and after the single allowed thread name lookup happens.

cschanhniem · 2026-06-19T07:09:30Z

I traced the code path. The issue is in python_spy.rs in _get_python_thread_name (around line 330):

fn _get_python_thread_name(&mut self, python_thread_id: u64) -> Option<String> {
    match self.python_thread_names.get(&python_thread_id) {
        Some(thread_name) => Some(thread_name.clone()),
        None => {
            self.python_thread_names = thread_name_lookup(self).unwrap_or_default();
            self.python_thread_names.get(&python_thread_id).cloned()
        }
    }
}

Two compounding problems:

1. O(N^2) per sample: _get_python_thread_name is called inside the thread loop in _get_stack_traces, once per PyThreadState. When N threads are uncached, the full thread_name_lookup (which traverses the entire threading._active dict) runs N times per sample instead of once.

2. No negative caching: When thread_name_lookup fails (threading module not imported, or threads created outside threading), unwrap_or_default() returns an empty HashMap. Every subsequent thread in the same sample — and every sample after — also misses, triggering the same failing lookup.

Fix: move the lookup out of the per-thread function and into _get_stack_traces, running it once per sample call, and add a bool flag to skip re-attempt after the first failure:

// Add to PythonSpy struct:
pub thread_name_lookup_attempted: bool,

// In _get_stack_traces, before the thread loop:
if !self.thread_name_lookup_attempted {
    self.python_thread_names = thread_name_lookup(self).unwrap_or_default();
    self.thread_name_lookup_attempted = true;
}

// Simplified _get_python_thread_name:
fn _get_python_thread_name(&mut self, python_thread_id: u64) -> Option<String> {
    self.python_thread_names.get(&python_thread_id).cloned()
}

This ensures the full dict walk happens exactly once per sample (or once ever if the threading module is absent). The stale-cache issue from recycled TIDs is already handled in _get_os_thread_id which clears python_thread_names when it detects a recycled TID, so correctness is preserved.

tdryer and others added 2 commits July 29, 2025 11:19

Merge branch 'master' into thread-name-lookups

2efdf43

benfred added the enhancement New feature or request label Apr 21, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix excessive thread name lookups#790

Fix excessive thread name lookups#790
tdryer wants to merge 2 commits into
benfred:masterfrom
tdryer:thread-name-lookups

tdryer commented Jul 29, 2025

Uh oh!

cschanhniem commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

tdryer commented Jul 29, 2025

Uh oh!

cschanhniem commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants